176 research outputs found

    Subsampling Error in Stochastic Gradient Langevin Diffusions

    Stochastic Gradient Langevin Dynamics (SGLD) is widely used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data. As opposed to many usual Markov chain Monte Carlo (MCMC) algorithms, SGLD is not stationary with respect to the posterior distribution; two sources of error appear: the first error is introduced by an Euler--Maruyama discretisation of a Langevin diffusion process; the second error comes from the data subsampling that enables its use in large-scale data settings. In this work, we consider an idealised version of SGLD to analyse the method's pure subsampling error, which we then see as a best-case error for diffusion-based subsampling MCMC methods. Indeed, we introduce and study the Stochastic Gradient Langevin Diffusion (SGLDiff), a continuous-time Markov process that follows the Langevin diffusion corresponding to a data subset and switches this data subset after exponential waiting times. There, we show that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time. Importantly, this fractional power does not depend on the dimension of the state space. We bring our results into context with other analyses of SGLD.
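
The two error sources named in the abstract above can be seen in a minimal discrete-time SGLD sketch. This is standard SGLD, not the continuous-time SGLDiff process studied in the paper, and the toy model (Gaussian data with unit variance, a N(0, 10^2) prior on the mean), the function name `sgld_gaussian_mean`, and all default parameters are assumptions chosen for illustration.

```python
import numpy as np

def sgld_gaussian_mean(data, step=1e-3, batch_size=10, n_iter=5000, seed=0):
    """Toy SGLD sampler for the mean of unit-variance Gaussian data.

    Assumed model (illustrative only): x_i ~ N(theta, 1), prior theta ~ N(0, 10^2).
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    samples = np.empty(n_iter)
    for k in range(n_iter):
        # Subsampling error: the full-data gradient is replaced by a
        # rescaled minibatch estimate.
        batch = rng.choice(data, size=batch_size, replace=False)
        grad_log_post = -theta / 100.0 + (n / batch_size) * np.sum(batch - theta)
        # Discretisation error: one Euler--Maruyama step of the Langevin
        # diffusion d(theta) = 0.5 * grad_log_post dt + dW.
        theta = theta + 0.5 * step * grad_log_post + np.sqrt(step) * rng.normal()
        samples[k] = theta
    return samples
```

Calling `sgld_gaussian_mean(data)` on a one-dimensional NumPy array returns the chain of iterates; after discarding an initial burn-in, their average should lie close to the exact posterior mean `data.sum() / (len(data) + 0.01)` of this toy model.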

    Association between Prediagnostic Weight Change and Colon Cancer Risk in a Prospective Cohort

    Background and Objectives: Obesity is defined by the World Health Organization (WHO) as a BMI value over 30 and is associated with an increased risk of colon cancer in many studies. Whether weight change during adulthood is related to the risk of colon cancer is not clear. This study not only investigates the relationship between body size at different ages and colorectal cancer risk but also focuses on the effect of weight change throughout the adult years as related to both gender and stage of life. Design and analysis: A prospective cohort of 15,008 cancer-free people was followed up from 1989 through 2007. Cox proportional hazards regression models adjusted for lifestyle risk factors were used to calculate hazard ratios and 95% confidence intervals of incident colorectal/colon cancer. Age-standardized incidence and age-adjusted risk ratios were compared to address the association between different categorizations of BMI and colorectal cancer. Stratification by stage of life and by gender was conducted to evaluate effect modification by these factors. Results: People with higher BMI at baseline tend to have a higher risk of colorectal cancer during almost 20 years of follow-up. The risk of colon cancer for people with moderate weight gain between age 21 and study baseline is 1.35 (95% CI 0.91 to 1.99) compared to people with constant or lower weight gain. If the weight change occurred between age 21 and age 65, the hazard ratio is 1.45 (95% CI 0.93 to 2.25). The hazard ratios of weight gain between age 21 and study baseline for men and women are 1.23 (95% CI 0.63 to 2.41) and 1.16 (95% CI 0.65 to 2.07), respectively. Conclusion: The data from this study suggest that high BMI and high weight gain might increase the risk of colon but not rectal cancer. The life stage during which weight change is evaluated may modify the effect of weight change on the risk of colon cancer. There is no significant effect modification by gender on the effects of weight gain.

    Uncertainty Quantification over Graph with Conformalized Graph Neural Networks

    Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Moreover, besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features. Comment: Published at NeurIPS 202
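
The conformal prediction step that the abstract above builds on can be sketched with standard split conformal prediction for regression. This is the generic textbook procedure, not CF-GNN's topology-aware correction model; the function names `conformal_quantile` and `prediction_interval` and the use of absolute residuals as non-conformity scores are assumptions made for illustration.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected quantile of calibration non-conformity scores.

    Returns the k-th smallest score with k = ceil((n + 1) * (1 - alpha)),
    or infinity when the calibration set is too small for the target level.
    """
    scores = np.sort(np.asarray(cal_scores))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return scores[k - 1]

def prediction_interval(pred, q):
    """Symmetric interval around a point prediction, using |y - y_hat| scores."""
    return pred - q, pred + q
```

Here `cal_scores` would be the absolute residuals of any fitted model on a held-out calibration set. Under exchangeability of calibration and test points, the resulting interval covers the true label with probability at least `1 - alpha`; the permutation invariance condition mentioned in the abstract is what justifies that assumption on graph data.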

    DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm

    Decentralized Stochastic Gradient Descent (SGD) is an emerging neural network training approach that enables multiple agents to train a model collaboratively and simultaneously. Rather than using a central parameter server to collect gradients from all the agents, each agent keeps a copy of the model parameters and communicates with a small number of other agents to exchange model updates. Their communication, governed by the communication topology and gossip weight matrices, facilitates the exchange of model updates. The state-of-the-art approach uses the dynamic one-peer exponential-2 topology, achieving faster training and better scalability than the ring, grid, torus, and hypercube topologies. However, this approach requires a power-of-2 number of agents, which is impractical at scale. In this paper, we remove this restriction and propose Decentralized SGD with Communication-optimal Exact Consensus Algorithm (DSGD-CECA), which works for any number of agents while still achieving state-of-the-art properties. In particular, DSGD-CECA incurs a unit per-iteration communication overhead and an $\tilde{O}(n^3)$ transient iteration complexity. Our proof is based on newly discovered properties of gossip weight matrices and a novel approach to combine them with DSGD's convergence analysis. Numerical experiments show the efficiency of DSGD-CECA.
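
The power-of-2 restriction mentioned in the abstract above can be illustrated with a small gossip-averaging sketch of the one-peer exponential-2 topology. This is not the DSGD-CECA algorithm itself; the function name `one_peer_exp2_rounds`, the equal weights of 1/2, and the neighbor rule (agent i averages with agent (i + 2^t) mod n in round t) are assumptions about the usual form of that baseline topology.

```python
import numpy as np

def one_peer_exp2_rounds(x, n_rounds):
    """Gossip averaging over a one-peer exponential-2 topology.

    In round t, agent i averages its value with that of agent
    (i + 2**(t mod tau)) mod n, where tau = ceil(log2(n)).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    tau = max(1, (n - 1).bit_length())  # integer ceil(log2(n)) for n >= 2
    for t in range(n_rounds):
        shift = 2 ** (t % tau)
        # np.roll(x, -shift)[i] equals x[(i + shift) % n]
        x = 0.5 * (x + np.roll(x, -shift))
    return x
```

With 8 agents, three rounds leave every agent holding the exact global average. With 6 agents, the same three rounds preserve the network-wide mean but do not bring the agents to agreement, which is the gap that motivates an exact consensus scheme for arbitrary agent counts.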

    Engendering the City: A Participatory Approach to Gender-Responsive Planning and Urban Design in Cairo

    The city of Cairo has witnessed a considerable increase in crimes against women, compelling women to avoid or minimise their use of public spaces in recent years. The absence of consideration for women in city planning has made Egyptian women feel further excluded and threatened by the public space, in addition to the patriarchal social relations and religious conservatism. As part of the ‘gender-inclusive cities’ research project, this study adopts a participatory approach as a tool for women's empowerment with the goal of promoting bottom-up models of planning, dissolving gendered norms, and improving women's status in a patriarchal society. The chapter provides an example of localised gender-inclusive design addressing women's spatial sensibilities and connecting them to the broader objectives of participation and emancipation. The findings of this study can help planners and policy makers co-create safer public spaces for local women, reduce spatial inequality, and facilitate their right to the city.

    Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate

    Decentralized optimization is an emerging paradigm in distributed learning in which agents achieve network-wide solutions by peer-to-peer communication without the central server. Since communication tends to be slower than computation, when each agent communicates with only a few neighboring agents per iteration, they can complete iterations faster than with more agents or a central server. However, the total number of iterations to reach a network-wide solution is affected by the speed at which the agents' information is "mixed" by communication. We found that popular communication topologies either have large maximum degrees (such as stars and complete graphs) or are ineffective at mixing information (such as rings and grids). To address this problem, we propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate that is used to measure the mixing efficiency. In the proposed family, EquiStatic has a degree of $\Theta(\ln(n))$, where $n$ is the network size, and a series of time-dependent one-peer topologies, EquiDyn, has a constant degree of 1. We generate EquiDyn through a certain random sampling procedure. Both of them achieve an $n$-independent consensus rate. We apply them to decentralized SGD and decentralized gradient tracking and obtain faster communication and better convergence, theoretically and empirically. Our code is implemented through BlueFog and available at https://github.com/kexinjinnn/EquiTopo. Comment: NeurIPS 202

    Symmetry-Preserving Program Representations for Learning Code Semantics

    Large Language Models (LLMs) have shown promise in automated program reasoning, a crucial aspect of many security tasks. However, existing LLM architectures for code are often borrowed from other domains like natural language processing, raising concerns about their generalization and robustness to unseen code. A key generalization challenge is to incorporate the knowledge of code semantics, including control and data flow, into the LLM architectures. Drawing inspiration from examples of convolution layers exploiting translation symmetry, we explore how code symmetries can enhance LLM architectures for program analysis and modeling. We present a rigorous group-theoretic framework that formally defines code symmetries as semantics-preserving transformations and provides techniques for precisely reasoning about symmetry preservation within LLM architectures. Using this framework, we introduce a novel variant of self-attention that preserves program symmetries, demonstrating its effectiveness in generalization and robustness through detailed experimental evaluations across different binary and source code analysis tasks. Overall, our code symmetry framework offers rigorous and powerful reasoning techniques that can guide the future development of specialized LLMs for code and advance LLM-guided program reasoning tasks

    Haemophilus parasuis Infection Disrupts Adherens Junctions and Initializes EMT Dependent on Canonical Wnt/β-Catenin Signaling Pathway

    In this study, animal experimentation verified that the canonical Wnt/β-catenin signaling pathway was activated under a reduced activity of p-β-catenin (Ser33/37/Thr41) and an increased accumulation of β-catenin in the lungs and kidneys of pigs infected with a highly virulent strain of H. parasuis. In PK-15 and NPTr cells, it was also confirmed that infection with a high-virulence strain of H. parasuis induced cytoplasmic accumulation and nuclear translocation of β-catenin. H. parasuis infection caused a sharp degradation of E-cadherin and an increase of the epithelial cell monolayer permeability, as well as a broken interaction between β-catenin and E-cadherin dependent on Wnt/β-catenin signaling pathway. Moreover, Wnt/β-catenin signaling pathway also contributed to the initiation of epithelial-mesenchymal transition (EMT) during high-virulence strain of H. parasuis infection with expression changes of epithelial/mesenchymal markers, increased migratory capabilities as well as the morphologically spindle-like switch in PK-15 and NPTr cells. Therefore, we originally speculated that H. parasuis infection activates the canonical Wnt/β-catenin signaling pathway leading to a disruption of the epithelial barrier, altering cell structure and increasing cell migration, which results in severe acute systemic infection characterized by fibrinous polyserositis during H. parasuis infection